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Abstract 

Many cellular behaviors are regulated by gene regulation networks, ki- 
netics of which is one of the main subjects in the study of systems biology. 
Because of the low number molecules in these reacting systems, stochas- 
tic effects are significant. In recent years, stochasticity in modeling the 
kinetics of gene regulation networks have been drawing the attention of 
many researchers. This paper is a self contained review trying to provide 
an overview of stochastic modeling. I will introduce the derivation of the 
main equations in modeling the biochemical systems with intrinsic noise 
(chemical master equation, Fokker-Plan equation, reaction rate equation, 
chemical Langevin equation), and will discuss the relations between these 
formulations. The mathematical formulations for systems with fluctua- 
tions in kinetic parameters are also discussed. Finally, I will introduce 
the exact stochastic simulation algorithm and the approximate explicit 
tau-leaping method for making numerical simulations. 



1 Introduction 

Systems biology is an interdisciplinary science of discovering, modeling, under- 
standing and ultimately engineering at the molecular level the dynamic rela- 
tionships between the biological molecules that define living organism^ This 
field is increasingly hot in recent decades as modeling molecular systems is not 
only fascinating but also possible in the post-genomics science. At the molecu- 
lar level, many cellular behaviors are regulated by genetic regulation networks 
in which the stochasticity is significant. To describe these stochastic chemical 
kinetics, stochastic modeling is highlighted recently [20l [29l [5T] . 

Chemical dynamics have been widely accepted to study chemical kinetics of 
reacting systems with large molecule populations, typically in the order of 10^^. 
In these systems, the kinetics are nearly deterministic and can be described 
by a set of ordinary differential equations-the reaction rate equations, or partial 
differential equations if spatial movements are taken into account. Nevertheless, 
in intracellular molecule kinetics, stochasticity is significant because the numbers 
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of each molecule species are very low. For instance, only one gene, either active 
or inactive, is involved in most activities of gene expressions. In an individual 
bacteria, there are less than 20 transcriptions of mRNAs from a single gene [32] . 
Typical molecule numbers of the same protein specie in a cell are usually no more 
than a few thousands. Thus, fluctuations in protein activities are significant 
due to the low-number effect. Reaction rate equations fail to describe these 
fluctuations. In this review, I will introduce main equations for the stochastic 
modeling of such systems, and discuss numerical methods used in stochastic 
simulations. 

This paper will first briefly review assumptions and notations for describing 
a biochemical system, followed by two examples of biological processes. Next, I 
will introduce several theoretical formulations for modeling biochemical system 
kinetics with merely intrinsic noise, followed by the mathematical formulations 
for the situation in which kinetic parameters are random. Finally, some stochas- 
tic simulation methods supported by the previous theories are introduced. The 
paper will be concluded with a summarization of the theoretical structure of 
stochastic modeling. 

2 Stochasticity in biological processes 

It is no doubt that biological processes are essentially random [TH I^Hl ISSl 133 
[551 HH I33J HHl [ST]. Both cellular behavior and the cellular environment are 
stochastic. Phenotypes vary across isogenic populations and in individual cell 
over time [51[ . Gene expression is a fundamentally stochastic process, with 
randomness in transcription and translation leading to cell-to-cell variations in 
mRN A and protein levels [TUl [HI [10] • Noise propagation in gene networks has 
important consequences for cellular functions, being beneficial in some contexts 
and harmful in others [HI \^ . 

Essentially, kinetics of biological molecules in living cells are consequences of 
chemically reacting systems. In this section, we first review basic assumptions 
and descriptions of general chemical systems, and then give two examples to 
demonstrate their applications. 

2.1 Chemical systems 

Consider a system of well-stirred mixture of iV > 1 molecular species {5*1, • • • , Sn}, 
inside some fixed volume f2 and at constant temperature, through M > 1 reac- 
tion channels {-Ri, • • • , Rm}- We can specify the dynamical state of this system 
by X(i) = (Ai(t),'- • • ,XNit)), where 

Xi{t) = the number of Si molecule in 

the system at time t, (t = 1, • • • , N). ^ ' 

We will described the evolution of X(t) from some given initial state X(to) = xg. 
It is obvious that X(t) is a stochastic process, because the time at which a 
particular reaction occurs is random. Therefore, instead of tracking a single 
pathway, our goal is to study the evolution of the statistical properties of system 
states. 

In this review paper, we always assume that each reaction, once occur, com- 
pletes instantaneously. This is to be distinguished with the systems involve 
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reactions with delay [15] ■ Further, we assume that the system is well stirred 
such that at any moment, each reactions occur with equal probability at any 
position. Under these assumptions, each reaction channel Rj associates with a 
propensity function Oj and a state-change vector Vj = (vji,--- ,WjAr), which 
are defined such that 

Ojijijdt = the probability, given X(t) — x, that one reaction Rj 

will occur somewhere inside O in the next infinitesimal (2) 
time interval [t,t + dt), {j = 1, ■ ■ ■ , M). 



and 



Vji = the change in the number of Si molecule produced 
by one Rj reaction, (j = 1, • • • , M; i ^ 1, ■ ■ ■ N). 



(3) 



The propensity function and the state-change vector together specify the reac- 
tion channel Rj. Therefore, the equations given below to describe the evolution 
of a biochemical system are derived from the propensity functions and stage- 
change vectors connected to the M reaction channels. 

The state-change vector of a reaction channel is easy to be obtained by 
counting the numbers of each molecule species that are consumed and produced 
in one reaction. For instance, if Ri were the reaction -I- 2S2 — >■ '2S3, then 
Vi = (-l,-2,2,...). 

Exact descriptions of propensity functions associate with the ad hoc stochas- 
ticity of deterministic chemical kinetics [32] , and have solid microphysical basis. 
In general, the function Oj have the mathematical form [18] 

aj{x) = Cjhjix). (4) 

Here Cj is the specific probability rate constant for the channel Rj, which is 
defined such that Cjdt is the probability that a randomly chosen combination 
of Rj reactant molecules will react accordingly in the next infinitesimal time 
interval dt. This probability Cjdt equals the multiple of two parts, the probability 
of a randomly chosen combination of Rj reactant molecules will collide in the 
next dt, and the probability that a colliding reactant molecules will actually 
react according to Rj. The first probability depends on the average relative 
speeds (which in turn depends on the temperature), the collision sections of a 
reactant molecules, and the system volume il. The second probability depends 
on the chemical energy barrier A/i of the reaction Rj (Figure [ij, and usually 
associate with temperature through a Boltzmann factor q-'^ij-I^bT ^ where fc^ is 
the Boltzmann constant. 

The function hjiTi) in (jj) measures the number of distinct combinations 
of Rj reactant molecules available in the state x. It can be easily obtained 
from the reaction Rj. For example, in the above reaction Ri, we would have 
hj{x.) = x\X2{xi — l)/2, which give the number of combinations to select one 
S\ molecule from x\ of them, and two 5*2 molecules from xi of them. More 
examples are given below. 

In general, for a chemical reaction 

Rj : rujiSi + • • • + rrij^SN ~^ njiSi + ■ ■ ■ + Uj^SN, (5) 
we would have 

N 



Vji — Uji — TOji, "jv^; >^ I I i7 rr 



^J-^ mjk\{xk ~ rrijky. 
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Figure 1: Schematic of chemical energy for the reaction O2 + 2H2 2H2O. 
If for any k and j, have Xk ^ rujk, then approximately 

N 
k=l 

The reaction rate constant Cj can only be obtained from experiments, and usu- 
ally depends on the system volume and the temperature. 

In real systems, most reaction channels are either monomolecular or bi- 
molecular reaction. For a monomolecular reaction, the reaction rate constant 
Cj is independent of the system volume $7. For a bimolecular reaction, the rate 
constant Cj is inversely proportional O. Trimolecular reactions do not physi- 
cally occur in dilute solutions with appreciable frequency. One can consider a 
trimolecular reaction as the combined result of two bimolecular reactions, and 
involved an additional short-lived species. For such an "effective trimolecular" 
reaction, the approximate Cj is proportional to [T5] . 

2.2 Examples in gene regulations 
2.2.1 Gene expression 

Gene expression is a basic biological process. Reactions in gene expression 
include promoter activity and inactivity, transcription^ translation^ and decaying 
of mRNA and proteins. Typical steps in gene expression are illustrated in Figure 
[H (also refer [32 )■ Note that transcription is a process of mRNA synthesis as 
specified by the gene, in which only the information is read out and the gene 
is not consumed. Similarly, proteins are translated from a mRNA sequence 
according to the genetic code, and the niRNA is not consumed in this process. 

In many gene regulations, the transition between active and inactive pro- 
moter states are regulated by a proteins (activator or repressor). In the case 
of activation, the activator bind to the inactive promoter to enhance the gene 
expression. The reaction channels Ri and R2 become 

i?i : ATs + Xi X2, R2 ■ X2 — > Xi + A"5, 

where X^ stands for the number of the activator. The corresponding propensity 
functions and state-change vectors are ai(X) — ciAriA"5,a2(X) = C2A"2, Vi = 
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Figure 2: (From ref. [21]) A model of gene expression. Each step represents the 
biochemical reactions which associate with transition between promoter states, 
production and decaying of mRNAs and proteins (here C3 > C4). 



(—1, 1, 0, 0, —1), and V2 = (1, —1, 0, 0, 1). Similarly, in the case of repressor, we 
should have 

Ri : Xi — )• X2 + X5 , i?2 : X^ + X2 — > Xi , 
andai(X) = ci^i, a2(X) = C2X2X5, vi = (-1, 1, 0, 0, 1), and V2 = (1,-1,0,0,-1). 

2.2.2 Circadian oscillator 

Figure [3] shows a simple model of circadian oscillator based on a common pos- 
itive and negative control elements found experimentally [57j . In this model, 
two genes, an activator A and a repressor R, are transcribed into mRNA and 
subsequently translated into protein. The activator A binds to both A and R 
promoters to increase their transcription rates. The protein R binds to and 
sequester the protein A, and therefore acts as a repressor. Figure [3] shows the 
propensity functions, and stage-change vectors of this model (c^ in the Figure 
gives the rate constant of the reaction channel Ri). Note that in the reaction 
channel Riq, the complex breaks in to R because of the degradation of A. Thus 
i?i6 is not the reversion process of R15. 

3 Mathematical formulations— intrinsic noise 

First, we assume that the propensity functions are time independent, i.e., the 
reaction rates Cj in (|4]) are constants. In this situation, the fluctuations in the 
system are inherent to the system of interest (intrinsic noise). The opposite 
case is extrinsic noise, which arises from variability in factors that are consider 
to be external. Mathematical formulations for extrinsic noise will be discussed 
in next section. 

3.1 Chemical master equation 

From the propensity function given in previous, the state vector X(t) is a jump- 
type Markov process on the non-negative A^-dimensional integer lattice. In 
following analysis of such a system, we will focus on the conditional probability 
function 

P(x, t|xo, to) = Prob{X(i) x, given that X(to) = xq}. (6) 
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Figure 3: (From ref. [S7]) A biochemical network of the circadian oscillator 
model. 



Hereinafter, we use an upper case letter to denote a random variable, and the 
corresponding lower case letter for a possible value of that random variable. 

Through the probability function, the average or expectation value of any 
quantity /(X|xo,to) defined on the system is given by 

(/(X|xo, io)> = / ^1^0, io). (7) 

X 

Physically, in an ensemble of identical systems starting from the same initial 
state X(to) = xq, the function P(x, t|xo,to) gives the fraction of subsystems 
with state X(t) = x. 

To derive a time evolution for the probability function P(x, t|xo, io)j we 
take a time increment dt and consider the variation between the probability of 
X(t) = X and of X(t + dt) = x, given that X(to) = xq. This variation is 

P(x, t + (ii|xo, to) ~ -P(x, t|xo, ^o) — Increasing of the probability in dt 

Decreasing of the probability in dt 

\ (8) 

We take dt so small such that the probability of having two or more reactions in 
dt is negligible compared to the probability of having only one reaction. Then 
increasing of the probability in dt occurs when a system with state X(i) = x— Vj 
reacts according to Rj in (t,t + dt), the probability of which is aj(x — Wj)dt. 
Thus, 

M 

Increasing of the probability in dt — P(x — Vj, t|xo, to)aj(x — Vj)dt. (9) 
Similarly, when a system with state X(i) = x reacts according to any reaction 
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channel Rj in {t,t + dt), the probabihty P(x, t|xo,to) wiU decrease. Thus, 

M 

Decreasing of the probabihty in dt — ''^^ P{x,t\xQ,to)aj(x}dt. (10) 

Substituting © and (HU]) into we obtain 

M 

P{x,t + dt\xo,to) - P{x,t\xQ,to) = ^P(x- Vj,t|xo,to)aj(x- Vj)di 

A/ 

^ X! *l^0' W^*' 
which yields, with the limit dt — > 0, the chemical master equation [18L 156] : 

— P(x,i|xo,to) = X![^(^^'^J'*I^O'^o)°^j(^^'^j) ^-^(^'^I^O'^o)°^j(^)]- (^^) 

The equation (jlip is an exact consequence from the reaction channels char- 
acterized by propensity functions and stage-change vectors. If one can solve 
(|lip for P, we should be able to find out everything about the process X(i). 
However, such an exact solution of ((TT|) can rarely be obtained (refer [371 HO] 
for the examples of solving the chemical master equation analytically). 

In fact, the chemical master equation (jlip is a set of linear differential equa- 
tions with constant coefficients. The difficult on solving the equation (1111) comes 
from the high dimension, which equals the total number of possible states of the 
system under study. For example, for a system with 100 molecules species, each 
has two possible states {Xi{t) = or 1), the system has totally 2^^°" possible 
states, and therefore the equation (|lip constants 2^*^° equations! Even solving 
such a huge system numerically is a big challenge. 

3.2 Fokker-Plank equation 

If in a biochemical system, all components of X(i) are very large compared to 1, 
we can regard the components of X(t) as real numbers. We assume further that 
the functions /j(x) = aj(x)P(x, t|xo, io) are analytic in the variable x. With 
these two assumptions, we can use Taylor's expansion to write 

/,(x-v,)=/,(x)+ 5: n^b^ (-;'7,(x)). (12) 

|m|>li=l ^ 

Here m = (mi, • • • , mAr) € Z^, |m| — nii + ■ ■ ■ + m^- Substituting (|T2|) into 
(llip . we immediately obtain the chemical Kramer-Moyal equation (CKME) [18] 

^^(-'^)-En^(^) (x)p(x,t)), (13) 

|m|>li=l ^ ^ 

where 

M 

^'"^■■■■■'""(x)-Ei',7---«,T«^W- 
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Hereinafter, we omit the initial condition (xo,io)- If all functions /j(x — Vj) 
are smooth, the equation would equivalent to PT|) . Therefore, the chemical 
Kramer-Moyal equation is a "semi-rigorous" consequence of the chemical master 
equation pT|) . 

If we truncate the right hand side at |m| = 2, we obtain the chemical Fokker- 
Plank equation (CFPE) 



i=l l<i,k<N 



where 

1/ A/ 

Ai(x) = Ewjiaj(x), Bife(x) = ^VjiVjkaj{:>c). (15) 

We will re-obtain this chemical Fokker-Plank equation below from the chemical 
Langevin equation. 

3.3 Reaction rate equation 

If we multiply the chemical master equation (jlip through by Xi, and sum over 
all X, we obtain the following chemical ensemble average equation (CEAE) 

^=E«..(«.(X)) (z = l,2,...,7V). (16) 

Equation (|16p is an exact consequence of pT|) . Nevertheless, is not a 
close equation system unless all reaction channels are monomolecular. When all 
reaction channels are monomolecular, the propensity functions aj(X) are linear, 
and therefore 

(a,(X)) =a,((X)) {j = l,---,N). (17) 

Thus, the populations evolve deterministically according to a set of ordinary 
differential equations 

^^J2^^,a,iK) (^ = l,...,iV), (18) 

where the components of x(i) are now considered as real variables. 

In usual, the mean populations do not evolve according to when there 
are higher order reactions. However, (|18p is sometimes heuristic if we assuming 
that the fluctuations are not important and (|17p holds approximately. Conse- 
quently, the ordinary differential equations (jl8p also hold under this assumption. 
The equation (|18p is referred as the macroscopic reaction rate equation(KKE) , 
or chemical rate equation in some literatures. 

The reaction rate equation (|18p is often written in terms of the species con- 
centrations 

Z,{t) = XS)/^l {i^l,---,N). 
The "concentrations" form of the reaction rate equation has form 

^ ^Y.^^^~a,{^) (z^l,...,7V), (19) 
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where 

dj{z) ~ aj{flz)/n. 

While we examine the dependence in the propensity functions for monomolec- 
ular, bimolecular, and trimolecular reactions, we would find that the function dj 
is functionally identical to aj except that the rate constants Cj has been replaced 
by the reaction rate constants kj . 

The reaction rate equations are most commonly used in modeling biochem- 
ical reacting systems. However, this equation fails to describe the stochastic 
effects which can be very important in biological processes. We will introduce 
the chemical Langevin equation below that has a facility to describe the stochas- 
ticity, and easy to be studied, at least numerically. 

3.4 Chemical Langevin equation 

The chemical Langevin equation was derived to yield an approximate time- 
evolution equation of the Langevin type. The derivation of the equation, given 
by Gillespie, is based on the chemical master equation and two explicit dy- 
namical conditions as detailed below. Most of the following refer to Gillespie's 
original paper [T8] . 

Suppose that the state of a system at current time t is X(<) = x. Let 
Kj{yi,T) (r > 0) be the number of Rj reactions that occur in the subsequent 
time interval [t, t+r]. Since each of these reactions will change the Si population 
by Vji , the number of Si molecule in the system at time t + r will be 

M 

X,{t + T)^x^+Y,Kj{x,T)vj^, {i = l,---,N). (20) 

We note that Kj(j>c, r) is a random variable, and therefore Xi{t + r) is random. 

We will obtain an approximation of Kj (x, r) below by imposing the following 
conditions: 

Condition (i) Require r to be small enough such that the change in the state 
during [t, t + r] will be so slight that non of the propensity functions 
changes its value "appreciably" . 

Condition (ii) Require r to be large enough that the expected number of 
occurrences of each reaction channel Rj in [t, t + r] be much larger than 
1. 

From condition (i), the propensity functions satisfy 

aj{X{t')) « aj(x), \ft' e [t,t + T],\fj e [1,M]. (21) 

Thus, the probability of the reaction Rj to occur in any infinitesimal interval dr 
within [t, t+r] is aj{x.)dT. Thus, Kj{x, r), the occurrence of "events" of reaction 
Rj in the time interval [t,t + r], will be a statistically independent Poisson 
random variable, and is denoted by Vj {oj (x) , r) . So (^0)) can be approximated 

by 

M 

J!:,(i + T) -K^w,,Pj(a,(x),T), (^ = l,•••,iV) (22) 
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according to condition (i). 

The mean and variance of V{a, r) are 

{V{a, r)) = var{7'(a, r)} = ar. (23) 

Thus, condition (ii) means 

a,(x)T»l, Vje[l,Af]. (24) 

The inequahty p4p aUows us to approximate each Poisson random variable by 
a normal random variable with the same mean and variance. This leads to 
further approximation 

M 

Xi{t + t)^x, + Y, VjiJ\fj(aj{x)T, aj(x)T), (i = 1, • • • , iV) (25) 

i=i 

where J\fj (m, ct^) denotes the normal random variable with mean m and variance 
cr^. Here the M random variables are independent to each other. Notice that in 
the above approximation, we have converted the molecular population Xi from 
discretely changing integers to continuously changing real variables. 

We note the linear combination theorem for normal random variables, 

A/-(m,cr2) =TO + crA/'(0, 1), (^26) 

the equation (|25p can be rewritten as 

M M 

Xi(t + r) = + ^ Vj.aj (x)t + ^ Vji ^ aj (x)rAA,- (0,1) (i = 1, • • • , iV). (27) 

Now, we are ready to obtain Langevin type equations by making some purely 
notational changes. First, we denote the time interval t by dt, and write 

dX, = X,{t + dt)-Xi{t). 

Next, introduce M temporally uncorrelated, independent random process Wj (i), 
satisfying 

dWj (t) = Wj (t + dt) - Wj (t) = TVj (0, 1) Vrfi (j = 1, • • • , iV). (28) 

It is easy to verify that the processes Wj (t) have stationary independent incre- 
ments with mean 0, i.e., 

{dWj{t))=0, {dW^{t)dWj{t')) ^ d^jS{t~t')dt, yi <i,j < M,yt,t'. (29) 

Thus, each Wj is a Wiener process (or referred to as Brownian motion) [56]. 
Finally, recalling that x stands for X(t), the equation (^5]) becomes a Langevin 
equation (or stochastic differential equation) 

M M 

dX, = ^ Vj.aj {X)dt + Vj, ^aj{X)dWj (i = 1, • • • , iV). (30) 
j=i i=i 
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The stochastic differential equation ([50)) is the desired chemical Langevin equa- 
tion (CLE). The solution of ([50]) with initial condition X(0) — Xq is a stochastic 
process X(t) satisfying 



M M 



X(i) = Xo + X] / ^',.a,(X(s))ds + / i;,,Ja,(X(s))dW^,(s). (31) 

Here /io integral is used as the stochastic fluctuations are intrinsic |56j . 

In the above, it is obvious that the conditions (i) and (ii) are contradict 
to each other. It may be very well happen that both conditions cannot be 
satisfied simultaneously. In this case, the derivation of the chemical Langevin 
equation may fail. But there are many practical circumstances in which the 
two conditions can be simultaneously satisfied. As the inequality (j24p implies 
that aj(x) is large when r is small enough as required by condition (i). This 
is possible when the system has large molecular populations for each molecule 
species since aj (x) is typically proportional to one or more components of x. 

Even when the conditions (i) and (ii) cannot be satisfied simultaneously, the 
chemical Langevin equation ((30)) is still useful. As we will discuss below. 



3.5 Discussions about the chemical Langevin equation 

In many intracellular biochemical systems such as gene expressions and genetic 
networks, the molecule populations are small, and the reactions are usually slow. 
In these systems, the derivation of the chemical Langevin equation may fail. 
Nevertheless, the equation (150]) is still useful for such systems as it can provide 
reasonable descriptions for the statistical properties of the kinetic processes. 
The reasons are given below. 

3.5.1 Time-evolution of the mean 

If we take average to both sides of ([50)) . noticing 

(y^a,(X)dI^,) =0 (j = l,---,M) 
according to the Ito interpretation, we have 

^ = E^..(«.(X)) (* = l,---,iV). (32) 

This gives the same form of chemical ensemble average equation (|T5)) as we have 
obtained from the chemical master equation. 

3.5.2 Time-evolution of the correlations 

The correlations between the molecule numbers are defined as 

a,,{t)^{X,{t)X,{t)}~{X,{t)}{X,{t)), {l<t,j<N). (33) 

Multiply the chemical master equation (|lip through by XiXj and sum over 
all X, we obtain 

^^^^^ = J2 [{X^VkJak{■yi)) + {X,Vk^ak{■K)) + {vk^VkJak{■K))] . (34) 
fc=i 
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Furthermore, (jl6p gives 



M 



dt 



= [{vMakiX)){X,) + {vkjakiX)){X,)] . (35) 



fe=i 

Thus, recalhng (|15p . we have the chemical ensemble correlations equation (CECE) 



dt 



= {A{X){X, - (X,))) + {{X, - {X,))A,iX)) + (By(X)). (36) 



The equation p6p is exact from the chemical master equation. 

Now, starting from the chemical Langevin equation (I30p and applying the 
Ito formula ^38], we have 

diX.Xj) = X,(dX,)+X,(dX,) + (dX,)(dXj) 

M 



^[XjUfciafc(X) + XiWfcjafe(X)Xj + VktVkjaki'X.)]dt 

L 

M 

^[Xiffcj A/afc(X) + XjVkj A/afc(X)]dVKfc + o(dt) 



fc=i 

M 



Taking the average to both sides, we re-obtain which yields the same form 
of chemical ensemble correlations equation ([55]) . 

At states of near equilibrium, if the random fluctuations are not important, 
we have approximately 



N 



A=(^Mm), S = (i3,-((X))), (37) 



MX) « M{X)) + -^Y^{Xi (Xi)), (B.,(X)) « i?,,((x)) 
1=1 ' 

Substituting above approximations into (1551) . and defining the matrices 

I dX, 
we have 

^ = (^a + a^^) + S. (38) 

Equation (j38p gives the linear approximation of the time-evolution of the 
correlation functions near equilibrium. This an example of the well known 
fluctuation-dissipation theorem |56j . 

3.5.3 Fokker-Plank equation 

Consider the chemical Langevin equation (PU)) . the solution X(t) satisfies 

M M 

X,{t + At) = X,,{t) + Yvk^aJ{X)At + Yvk^Vak{X)AWk{t), (i - 1, • • • ,iV) 
fe=i fc=i 

(39) 

where 

AWk{t) = Wk{t + At)-Wk{t). 
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Thus, assuming X(<) = x, the displacement Ax = X(< + At) — X(t) satisfies 



M 



(Axi) = y^t;fcjQfc(x)At 



k=l 

and 

M 

(AxiAxj) = VkiVkjak{y^)At. 



k=l 



Let P(x, t) to be the conditional probability density function as previous, 
then P(x, t) satisfies 

P(x,i + dt)-F(x,t) / P(x- Ax,t)VK(Ax,dt;x- Ax,t)dAx 

- / P(x,t)VF(Ax,dt;x,t)dAx, 

where VF(Ax, dt; x, t) is the transition probability from X(t) = x to X(/: + (it) = 
X + Ax. Applying Taylor expansion to the function 



noticing 



and 



we have 



h{x - Ax) = P(x - Ax, t)W{Ax, dt; x - Ax, t), 
/ A2;,W^(Ax, dt; x, t) = (Ax,), 

JAxGR" 

/ Ax.,AjW{Ax, dt; x, t)dAx = {AxiAxj), 

JAxGR" 



Pix,t + dt)-P{x,t) = — [P(x,t)(Ax,)] 

4=1 ^' 

l<jj<W ' ^ 

Thus, letting dt (here At = dt), we re-obtain the chemical Fokker-Plank 
equation as previous. 

Thus, the chemical Langevin equation and the chemical master equation 
yield the same form of chemical Fokker-Plank equation. 



3.5.4 Remarks 

In comparing the chemical Langevin equation and chemical master equation, 
we should note the following: 

1. In this section, we have shown that the chemical Langevin equation and 
chemical master equation yield the same forms of chemical ensemble aver- 
age equation (jl6p , chemical ensemble correlation equation p6p , and chem- 
ical Fokker-Plank equation (fH)) . 
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2. Despite the same forms of equations and ([SS]). since the averages are 
taken with respect to probabihty functions P(x, i), which are obtained 
from cither the solutions of the chemical master equation or the chemical 
Langevin equation, the two equations (CLE and CME) do not always 
yield the same dynamics of ensemble average and correlation. These are 
possible when all reaction channels are monomolecular, in which cases the 
two equations and ([5^ are, as we have discussed before, close. 

3. The chemical Langevin equation and chemical master equation yield the 
same form of chemical Fokker-Plan equation, which is a close equation. 
Thus, up to the second order approximation, the two equations give the 
same time-evolutions of the probability function P(x, t). 

4 Mathematical formulations— fluctuation in ki- 
netic parameters 

In the previous discussion, we assume that the reaction rates Cj in ^ are 
constants. This is only a rough approximation of the real world, in which cell 
environments are stochastic, and therefore the reaction rates are random. Here, 
we will discuss the mathematical formulations for situations in which there are 
fluctuations in kinetic parameters. 

4.1 Reaction rate as a random process 

When the reaction rate Cj is random, the previous propensity function aj (x) in 
(HI) should be rewritten as 

a,(x,i) -c,(t)/ij(x). (40) 

Here Cj{t), as previous, is the specific probability reaction rate for channel Rj at 
time t. This rate is usually random, depending on the fluctuating environment 
whose explicit time dependence is not known. A reasonable approximation is 
given below. 

In previous discussions, we have obtained Cj oc e"'^^^/'^^-^. If there are noise 
perturbations to the energy barrier, we can replace Afij by Ap,j — kBTrij[t), 
where rij(t) is a stochastic process. Consequently, we can replace the reaction 
rate Cj(t) by 

c,(<)=c,e''^W/(e''^(*)), (41) 

where Cj is a constant, measures the mean of the reaction rate. 

In many cases, the noise perturbation ri[t) (here we omit the subscript j) can 
be described by an Ornsterin- Uhlenbeck process, which is given by a solution of 
following stochastic differential equation 

dr] = ~{ri/T)dt + {a/T)dW. (42) 

Here W is a Wiener process, t and a are positive constants, measuring the 
autocorrelation time and variance, respectively. It is easy to verify that ri(t) is 
normally distributed and has an exponentially decaying stationary autocorrela- 
tion function [m [51 ESI [SB] 

ivrnt')) = Sje-l*-*'!/^ (43) 
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The Ornstein-Uhlenbeck process is an example of color noise. When r — ^ 0, 
r](t) approaches to Gaussian white noise. 

With ri{t) an Ornstein-Uhlenbeck process, the stationary distribution of the 
reaction rate Cj (t) is then log-normal. Log- normal distribution have been seen in 
many applications 1341 . For instance, log- normal rather than normal distribution 
have been measured for gene expression rates |451 152] . 

When ri{t) is an Orstein-Uhlenbeck process, we have [5] 

(e-'W) =e"'/(4-). (44) 

Therefore, the reaction rate (|4T]) can be rewritten as 

Cj{t) = Cje''^(*)-'^?/(4^^). (45) 

Thus, the propensity function (^0]) now becomes 

aj(x, t) = c,e''^(*'-"^'/(4"^)/ij(x), (46) 

with r^jit) satisfies an equation of form (|42|l . 

Next, we will introduce generalizations of the reaction rate equation and the 
chemical Langevin equation to describe reaction systems with fluctuations in 
reaction rates. We note that in this situation, the state variable X alone is not 
enough to describe the state of a system, and the state of environment has to be 
taken into account as well. Thus, to generalize the chemical master equation or 
the Fokker-Plank equation, we should replace the previous probability function 
with P(x, c,t|xo,Co,t), where c = {ci, • • • ,cm}- This consideration is omitted 
here. 

4.2 Reaction rate equation 

Substituting the propensity functions of form (051) and into the reaction 
rate equation (|18p . we obtain the following stochastic differential equations 

dX, = «,,c,e''^-'?/(4^^)/ij(X) j dt, (z = 1, . . . , TV) (47) 

dry, = ^[^,|Tj)dt + {G,|T,)dW,, (j = l,---,M) (48) 

Here Cj,Tj,aj are positive constants. The equations (|Tf)) -(H5 )) generalize the 
reaction rate equation to describe the reaction system with fluctuations in the 
kinetic parameters. 

4.3 Chemical Langevin equation 

Similar to the above strategy, substituting the propensity functions of form (j46p 
and (|^ into the chemical Langevin equation ((5D|) . we obtain the following 
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stochastic differential equations 

M 



dX, = \^Vj,Cje'^'"'o''^^^^^hj{lC,\dt (49) 



M 



1 /2 

+ "j'* [cje-'^^-^y^^^^^hjiX)^ dWj, {i^l,...,N) 

rf'y, = -(r;,/r,)dt+(a,/r,)dW^j, (j = l,---,M) (50) 

Here Wj are independent Wiener processes, and cj, Tj, aj, as previous, are pos- 
itive constants. Tliis set of generalized chemical Langevin equations describes 
the stochasticity of chemical systems with both intrinsic noise and fluctuations 
in kinetic parameters. 



5 Stochastic simulations 

We have introduced several equations for modeling systems of biochemical re- 
actions. Here, we will introduce stochastic simulation methods that intend to 
mimic a random process X(t) of a system evolution. Once we have enough 
sampling pathways of the random process, we are able to calculate the proba- 
bility density function P(x, t) and other statistical behaviors including the mean 
trajectory and correlations. 



5.1 Stochastic simulation algorithm 

Assume the system in state x at time t. We define the next-reaction density 
function p{t, j\x.,t) as 

p(r, j|x, t)(ir = probability that, given X(t) = x, the next 

reaction in f2 will occur in the infinitesimal time 
interval + r, t + t + dr) , and will be an 
Rj reaction. 

An elementary probability argument based on the propensity function ^ yields 

mm 



j|x,t) = aj(x)exp(-ao(x)T), (0 < r < oo, j = 1, • • • , M) (52) 

where 

M 
k=l 

This formula provides the basis for the stochastic simulation algorithm(SSA) 
(also known as the Gillespie algorithm) . 

Following direct method is perhaps the simplest to generate a pair of numbers 
(r, j) in accordance with the probability ([5^ [TB] : First generate two random 
numbers ri and r2 of uniform distribution in the unit interval, and then take 



r 



= (l/ao(x))ln(l/ri) (53) 



j ~ the smallest integer satisfying aj'(x;) > r2ao(x). (54) 
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Below are main steps in the stochastic simulation algorithm (for details, refer 



1. Initialization: Let X = Xq, and t = to- 

2. Monte Carlo step: Generate a pair of random numbers (r, j') according 
to the probability density function (15^ (replace x by X). 

3. Update: Increase the time by r, and replace the molecule count by X+Vj . 

4. Iterate: Go back to Step 2 unless the simulation time has been exceeded. 

The stochastic simulation algorithm consists with the chemical master equation 
in the sense that (fTTj) and (j52p are exact consequence of the propensity func- 
tion This algorithm numerically simulation the time evolution of a given 
chemical system, and gives a sample trajectory of the real system. 

5.2 Tau- leaping algorithm 

In the derivation of the chemical Langevin equation, we can always select r 
small enough such that the the following leap condition is satisfied: 

Leap condition: During [t,t + t), no propensity function is likely 
to change its value by a significant amount. 

Consequently, the previous arguments indicate that we can approximately leap 
the system with a time r by taking 



where x = X(t). 

The equation (j55|) is the basic of the tau-leaping algorithm [JLQ^ : Starting from 
the current state x, we first choose a value r that satisfies the leap condition. 
Next, we generate for each j a random number kj according to Poisson distri- 



and increase the time by r. 

There are two practical issues need to be resolved in order to effectively apply 
the tau-leaping algorithm: First, how can we estimate the largest value of r that 
satisfies the leap condition? Second, how can we ensure that the generated kj 
values do not result in negative populations? 

The original estimation of the largest value of r was given by Gillespie and 
is sketched below [TO]. If r satisfies the leap condition, from ([55]) . the average 
state changes over time r is 



m) 



M 




(55) 




M 



where 



M 




(56) 
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is the state change per unit time. Therefore, the average difference in the 
propensity function aj(x) is given by |aj(x + A) — aj(x)|, which, according to 
the leap condition, should satisfies 

|a,(x + A)-a,(x)| <eao(x), (j = l,---,M), (57) 

where < £ <C 1, and 

M 

ao(x) = ^aj(x). (58) 
i=i 

Since 

\aj (x + A) - a, (x) I « A • Va^ (x) = r^, (x) . 

Let 

^.(x) = (59) 



(j57l) can be approximated by 



N 



T|^e:(x)&j»(x)| <£ao(x), (j = !,■■■ ,M), 



which yields 



N 



r<eao(x)/|5]6(x)6,»(x)|. 
Thus, the largest value of r is given by 

.e[i.M] [ ^ J 

In [21] and [6], two successive refinements were made. The latest r-selection 
procedure given in [S] is more accurate, easier to code, and faster to execute 
than the earlier procedures, but logically more complicated. 

If the r value generated above is much larger than the time required for the 
stochastic simulation algorithm, this approximate procedure will be faster than 
the exact stochastic simulation algorithm. However, it r turns out to be less 
than a few multiples of the time required for the stochastic simulation algorithm 
to make an exact time step (in an order of l/ao(x)), it would be better to use 
the stochastic simulation algorithm instead. 

To avoid the negative populations in tau-leaping, several strategies have been 
proposed, in which the unbounded Poisson random numbers kj are replaced by 
bonded binormal random numbers [71153). In 2005, Cao et al. [3] proposed a new 
approach to resolve this difficulty. In this new Poisson tau-leaping procedure, 
the reaction channels are separated into two classes: critical reactions that may 
exhaust one of its reactants after some firings, and noncritical reactions other 
wise. Next, the noncritical reactions are handled by the regular tau-leaping 
method to obtain a leap time r'. And apply the exact stochastic simulation 
algorithm to the critical reactions, which gives the time r" and the index jc of 
the next critical reaction. The actual time step t is then taken to be the smaller 
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of t' and r". If the former, no critical reactions fires, and if the latter, only one 
critical reaction Rj^ fires. 

Below are main steps for the modified tau-leaping procedure that can avoid 
negative populations (refer [3] for details) 

1. Initialization: Let X = xq, and t — to. 

2. Identify the critical reactions: Identify the reaction channels Rj for 
which aj(X) > and may exhaust one of its reactants after some firings. 

3. Calculate the leap time: Compute the largest leap time r' for the 
noncritical reactions. 

4. Monte Carlo step: Generate {t" ,jc) for the next critical reaction ac- 
cording to the modified density function according to (j52p . 

5. Determine next step: 

(a) If r' < t": Take t = t' . For all critical reactions set kj = 0. 
For all the noncritical reactions Rj , generate kj as a Poisson random 
variable with mean aj(X)T. 

(b) If r" < t': Take t = r". Set kj^ = 1, and for all other critical 
reactions set kj = 0. For all the noncritical reactions Rj, generate kj 
as a Poisson random variable with mean aj(X)r. 

6. Update: Increase the time by r, and replace the molecule count by X + 

7. Iterate: Go back to Step 2 unless the simulation time has been exceeded. 
5.3 Other simulation methods 

In additional to the prominent approximate acceleration procedure tau-leaping, 
there are some other strategies that tend to speedup the stochastic simulation 
algorithm. Here we briefly outline two of the most promising methods. 

Many real systems in biological processes involve chemical reactions with dif- 
ferent time scales, "fast" reactions fire very much more frequently than "slow" ones 
Procedures to handle such systems often involve a stochastic generalization of 
the quasi-steady-state assumption or partial (rapid) equilibrium methods of 
deterministic chemical kinetics [51 \S\ [231 HSl HZ] • The slow-scale stochastic sim- 
ulation algorithm (ssSSA) (or multiscale stochastic simulation algorithm) is a 
systematic procedure for partitioning the system into fast and slow reactions, 
and only simulate the slow reactions by specially modified propensity functions 
[2l[l[5]. 

Another approach to simulate multiscale chemical reaction systems include 
different kinds of Hybrid methods [H [T31 [Ml US] ■ Hybrid methods combine the 
deterministic reaction rate equation with the stochastic simulation algorithm. 
The idea is to split the system into two regimes: the continuous regime of 
large molecule population species, and the discrete regime of small molecule 
population species. The continuous regimes is treated by ordinary differential 
equations, while the discrete regime is simulated by the stochastic simulation 
algorithm. Hybrid methods efficiently utilize the multiscale properties of the 
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system. However, because of lacking a rigorous theoretical foundation, there 
are still many unsolved problems [S^ . 

There are many approaches trying to find the numerical solution of the 
chemical master equation [3 [TTl [T21 HSJ I^El UHl 1301 EZ] ■ In addition, numerical 
methods for the Langevin equation have been well documented (refer [31] for 
example). We will not get into these two subjects here. 

6 Summary 

In modeling a well-stirred chemical reacting systems, the chemical master equa- 
tion provides an 'exact' description of the time evolution of the states. However, 
it is difficulty to directly study the chemical master equation because of the di- 
mension problem. Several approximations are therefore developed, including the 
Fokker-Plank equation, reaction rate equation, and chemical Langevin equation. 
The reaction rate equation is widely used when fluctuations are not important. 
When the fluctuations are signiflcant, the chemical Langevin equation can pro- 
vide reasonable description for the statistical properties of the kinetics, despite 
the conditions in deriving the chemical Langevin equation may not hold. 

When there are noise perturbations to the kinetic parameters, there is no 
simple way to model the system dynamics because the time dependence of envi- 
ronment variables can be very complicated. In a particular case, we can replace 
the reaction rates by log-normal random variables, and generalize the reaction 
rate equation or the chemical Langevin equation to describe the dynamics of a 
chemical system with extrinsic noise. 

Stochastic simulation algorithm (SSA) is an 'exact' numerical simulation 
that shares the same fundamental basis as the chemical master equation. The 
approximate explicit tau-leaping produce, on the other hand, is closely relate 
to the chemical Langevin equation. The robustness and efficiency of the two 
methods have been considerable improved in recent years, and these procedures 
seems to be nearing maturity [5D]. In the last few years, some other strategies 
have been developed for simulating the systems that are dynamically stiff [201 

We conclude this paper with Figure SI which summarizes the theoretical 
structure of stochastic modeling for chemical kinetics (also refer ^20)). 
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